F Distribution

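Both the Student's t distribution and the chi-squared distribution can be generated from $n$ samples of a single random variable $X$ that follows a Gaussian (normal) distribution.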

[[school_notebook:8956e37db86c44b3b1b3a4c3357e590c]]

[[school_notebook:683cfb97b17041f3a9a0e6cbee5f1fef]]
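The two constructions can be checked numerically. The sketch below assumes the standard forms: the sum of $n$ squared standard normal samples for the chi-squared distribution with $n$ degrees of freedom, and the sample mean divided by its estimated standard error for the Student's t distribution with $n - 1$ degrees of freedom. The linked notebooks may present them differently. The seed, `n`, and `trials` are illustrative choices, not values from the text.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # seed chosen only for reproducibility
n = 10                           # samples per draw (illustrative)
trials = 100_000                 # number of repeated draws (illustrative)

# Each row holds n independent samples of a standard normal X.
x = rng.standard_normal((trials, n))

# Sum of n squared standard normal samples ~ chi-squared with n degrees of freedom.
chi2_samples = (x ** 2).sum(axis=1)

# Sample mean over its estimated standard error ~ Student's t with n - 1 degrees of freedom.
t_samples = x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n))

# Gap between an empirical probability and the SciPy reference CDF at one point.
chi2_gap = abs((chi2_samples <= n).mean() - stats.chi2(n).cdf(n))
t_gap = abs((t_samples <= 1.0).mean() - stats.t(n - 1).cdf(1.0))
print(chi2_gap, t_gap)
```

Both gaps should be small, on the order of the Monte Carlo error for this many trials.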

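Similarly, the F distribution can be generated from samples of two independent random variables $\chi^2_1(n_1)$ and $\chi^2_2(n_2)$, each following a chi-squared distribution. Let $x_1$ and $x_2$ be samples from the two chi-squared distributions. Dividing each by its degrees of freedom, $n_1$ and $n_2$ respectively, and taking the ratio gives a value that follows the $F(n_1, n_2)$ distribution. $n_1$ and $n_2$ are the degrees-of-freedom parameters of the F distribution.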

$$ \dfrac{x_1 / n_1}{x_2/ n_2} \sim F(n_1, n_2) $$
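As a rough check of this ratio, the following sketch draws two independent chi-squared samples with NumPy and compares a few empirical quantiles of the scaled ratio with those of `scipy.stats.f`. The degrees of freedom, sample size, and seed are arbitrary values picked for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # seed chosen only for reproducibility
n1, n2 = 5, 20                   # degrees of freedom (illustrative)
size = 200_000                   # number of simulated ratios (illustrative)

# Independent chi-squared samples with n1 and n2 degrees of freedom.
x1 = rng.chisquare(n1, size=size)
x2 = rng.chisquare(n2, size=size)

# Scale each sample by its degrees of freedom and take the ratio.
ratio = (x1 / n1) / (x2 / n2)

# Compare empirical quantiles with the theoretical F(n1, n2) quantiles.
probs = np.array([0.25, 0.5, 0.75, 0.9])
empirical_q = np.quantile(ratio, probs)
theoretical_q = stats.f(n1, n2).ppf(probs)
print(np.round(empirical_q, 3))
print(np.round(theoretical_q, 3))
```

The two printed rows should agree to about two decimal places; the remaining difference is sampling noise.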

The probability density function of the F distribution is defined as follows.

$$ f(x; n_1,n_2) = \dfrac{\sqrt{\dfrac{(n_1\,x)^{n_1}\,\,n_2^{n_2}} {(n_1\,x+n_2)^{n_1+n_2}}}} {x\,\text{Beta}\!\left(\frac{n_1}{2},\frac{n_2}{2}\right)} $$

The `f` distribution object in SciPy's `stats` subpackage supports the F distribution.



In [2]:

    
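import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
import scipy.stats

# Start the grid slightly above 0, where the F(1,1) density diverges.
xx = np.linspace(0.03, 3, 1000)
# sp.stats.f(n1, n2) fixes both degrees of freedom; successive plots share one axes.
plt.plot(xx, sp.stats.f(1, 1).pdf(xx), label="F(1,1)")
plt.plot(xx, sp.stats.f(2, 1).pdf(xx), label="F(2,1)")
plt.plot(xx, sp.stats.f(5, 2).pdf(xx), label="F(5,2)")
plt.plot(xx, sp.stats.f(10, 1).pdf(xx), label="F(10,1)")
plt.plot(xx, sp.stats.f(20, 20).pdf(xx), label="F(20,20)")
plt.legend()
plt.show()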